Benjamin Geer on 20 Nov 2000 09:13:05 -0000
Re: <nettime> Asia and domain names, etc. (@)

On Fri, Nov 17, 2000 at 12:25:30AM +0900, david@2dk.net wrote:
> Unicode, as I understand it, is a project to develop a global
> localisation standard -- a way to learn how to write one(=uni)
> source code that will be expedient to localise for any market.

Well, not really. It's actually a single character encoding that
includes all the characters of nearly all known languages that have
writing systems.

> This is a technical issue for software manufacturers who wish to
> become multinationals, and not one for finding universal ways of
> integrating living languages onto 'the' net.

I think you've misunderstood it; it's the latter. A Japanese, Thai,
Russian, or English document that's encoded in Unicode (as opposed to
one of the many older, language-specific encodings) can be displayed
without modification in any web browser or word processor that
understands Unicode and has a Unicode font. This is why all of
Microsoft's products, the Java platform, and the XML standard use
Unicode as their underlying encoding. It completely removes the need
to `translate' documents from one encoding into another in order to
display them in an environment other than the one they were written
in.

> ISO 10646 is an international standard in that somebody recognises
> that there is an issue here. It isn't a functioning initiative that
> has been actually globally adopted.

It's been adopted by every major software manufacturer, and my
impression is that it's pretty well-supported on most operating
systems. To the best of my knowledge, if you use a recent version of
Microsoft Word, you're writing documents in Unicode.

> But I do not know your work, or your affiliation to these
> initiatives.

I'm not affiliated with them, but nowadays they seem to affect the
work of every programmer.

> I, with my Japanese system have immense problems sending the exact
> same 'chinese' characters (though I also have a PRC chinese
> character OS which I can reboot into) to my friends in Korea or
> Taiwan.
> This is not a Unicode problem, nor anything that it will
> solve in the foreseeable future. Unicode means that all of us in
> these various countries may be attempting to send these files in
> various localised versions of MSWord which all function well in our
> markets.

Not at all. In fact, that's exactly the problem Unicode is meant to
solve. Localisation and encoding are basically separate issues.
Localisation means, for example, that a menu is marked `File' in the
English version of MS Word, and `Fichier' in the French version.
Encoding, on the other hand, is the way characters are represented as
bytes in the document produced. The idea of Unicode is to enable
every language to use the same encoding; therefore, you should be able
to use any Unicode-compliant version of MS Word, regardless of its
localisation, to read a document containing, say, a mixture of
Japanese, Hungarian, Korean, and Arabic.

> (You should see what a nettime post sent from someone with a French
> character set looks like when received on a double-byte OS. It's a
> mess!!)

The problem there is that French (along with some other European
languages) is traditionally encoded in ISO-8859-1, and Japanese
traditionally uses JIS or EUC. Most people on Nettime use ISO-8859-1
(probably without realising it). But if we all use Unicode-compliant
mail readers, and we all write our French and Japanese emails in
Unicode, everyone's emails will appear correctly on everyone else's
computers.

-- 
Benjamin Geer
http://www.btinternet.com/~amisuk/bg

# distributed via <nettime>: no commercial use without permission
# <nettime> is a moderated mailing list for net criticism,
# collaborative text filtering and cultural politics of the nets
# more info: majordomo@bbs.thing.net and "info nettime-l" in the msg body
# archive: http://www.nettime.org contact: nettime@bbs.thing.net
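
The encoding point in the message above (French text written in
ISO-8859-1 turning into a mess on a Japanese system, and a Unicode
encoding avoiding that) can be sketched in a few lines. This is only
an illustration, not anything from the original thread: the choice of
Python, the sample strings, the variable names, and the use of EUC-JP
as the reader's assumed encoding and UTF-8 as the Unicode encoding are
all assumptions made for the example.

```python
# Sample text (hypothetical): one French phrase, one Japanese word.
french = "déjà vu"
japanese = "日本語"

# The French sender's software writes the text as ISO-8859-1 bytes:
# one byte per character, accented letters in the range 0x80-0xFF.
latin1_bytes = french.encode("iso-8859-1")

# A reader that assumes EUC-JP treats those high bytes as the start of
# double-byte characters, so the accented letters do not survive.
# errors="replace" substitutes a placeholder instead of raising.
misread = latin1_bytes.decode("euc_jp", errors="replace")

# The reverse direction fails outright: ISO-8859-1 has no Japanese
# characters, so the Japanese word cannot be encoded in it at all.
try:
    japanese.encode("iso-8859-1")
    latin1_can_encode_japanese = True
except UnicodeEncodeError:
    latin1_can_encode_japanese = False

# With a single Unicode encoding (UTF-8 here), both languages fit in
# one document and decode back to exactly the same text.
mixed = french + " / " + japanese
utf8_bytes = mixed.encode("utf-8")
roundtrip = utf8_bytes.decode("utf-8")

print("misread as EUC-JP:", repr(misread))
print("ISO-8859-1 can hold Japanese:", latin1_can_encode_japanese)
print("UTF-8 round trip intact:", roundtrip == mixed)
```

Note that nothing here involves localisation: the bytes are the same
whatever language the reading program's menus are in. What matters is
only whether sender and reader agree on the encoding.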